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ABSTRACT 

Smooth penalty functions can be combined with numerical 

continuation/bifurcation techniques to produce a class of robust and fast 
algorithms for constrained optimization problems. The key to the 
development of these algorithms is the Expanded Lagrangian System which is 
derived and analyzed in this work. This parametrized system of nonlinear 
equations contains the penalty path as a solution, provides a smooth 
homotopy into the first-order necessary conditions, and yields a global 
optimization technique. Furthermore, the inevitable ill-conditioning 
present in a sequential optimization algorithm is removed for three 
penalty methods: the quadratic penalty function for equality constraints, 

and the logarithmic barrier function (an interior method) and the 
quadratic loss function (an exterior method) for inequality constraints. 
Although these techniques apply to optimization in general and to linear 
and nonlinear programming, calculus of variations, optimal control and 
parameter identification in particular, the development is primarily 
within the context of nonlinear programming. 
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1 . Introduction 

The use of smooth penalty functions in a sequential optimization 
algorithm to solve constrained optimization problems has long been 
regarded as unfashionable and numerically defective, principally because 
of an inevitable ill-conditioning that occurs as the penalty parameter 
tends to the prescribed limit. This view, which is prevalent in most 
texts and review articles [5, 6, 9, 19, 20], is quite valid and certainly 
motivated development of other numerical optimization techniques such as 
augmented Lagrangians and exact penalty functions. Our objective herein 
is to re-examine the use of these smooth penalty functions, not in a 
sequential optimization algorithm, but in a continuation-based algorithm 
for following the penalty function path defined as a solution of a 
parametrized system of nonlinear equations. This methodology now becomes 
viable because of the extensive development of numerical 
continuation/bifurcation techniques during the last decade [1, 11, 16, 17, 
25] and because we can remove the ill-conditioning for three fundamentally 
important smooth penalty functions. These continuation methods are 
capable of producing robust algorithms competitive with the fastest 
optimization techniques currently available and yield a method for global 
optimization. 

Although the techniques discussed here apply to constrained 
optimization in general and to linear and nonlinear programming, calculus 
of variations, optimal control and parameter identification in particular, 
our main focus in this work is on the general nonlinear programming 
problem 


min{f (x) |h(x) = 0, g(x) > 0} 
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where f: lR n -♦ IR 1 , h: R n -► !R q and g: IR n -♦ JR** are assumed to be twice 
continuously differentiable. (Section 5 contains examples from linear 
programming and the calculus of variations.) Within this context, the 
strength of the combination of nonlinear equations and smooth penalty 
functions is perhaps best explained by first examining the attributes of 
these two methods for solving optimization problems. 

The use of nonlinear equations to solve constrained opimization 
problems can be based on solving the first-order necessary conditions by 
Newton's method or one of its many variants [3; 5, p. 138]. An important 
attribute of Newton's method is its speed of convergence, but there are 
many deficiencies: a good initial approximation is needed, the derivatives 
must be supplied, and the linear algebra can be expensive. Two popular 
techniques for alleviating the first difficulty are continuation 
(embedding and homotopy) and damping while quasi-Newton methods and 


sequencing the 

matrix factorizations are useful 

for 

the 

latter two 

difficulties . 

For 

optimization problems 

there 

are 

additional 

difficulties. 

The 

inequality constraints 

and 

the 

sign 

of their 


corresponding multipliers must be satisfied and, most importantly, 
convergence should be to a local optimum. 

In spite of the ill-conditioning problem, smooth penalty functions 
have been extensively investigated and are known to have exceptional 
theoretical strengths. A particularly relevant and generic result 
pertains to a penalized objective function P(x, r) where the penalty 
parameter is arranged so that r = 0 is the desired limit. If {r^} is a 
decreasing sequence of penalty parameter values converging to zero and if 
{x. } is a corresponding sequence of global minimizers, then in the 
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presence of a constraint qualification any limit point of {x^} solves the 
original problem. If (x. } is relaxed to a sequence of local minimizers, 
then the limit point is only claimed to satisfy the first-order necessary 
conditions. In practice one normally follows a path of local minimizers, 
but in spite of this deficiency smooth penalty methods have been highly 
successful [13], modulo the ill-conditioning of the Hessian, in tracking 
these local minima to a solution of the original problem. From this 
result one naturally conjectures the existence of a parametrized system of 
equations which contains the penalty path and which reduces to the 
first-order necessary conditions at r = 0. 

Such a system is the Expanded Lagrangian System (ELS) which arises in 
one variant of the method as follows. Let P(x, r) denote the penalized 
objective function. Then the penalty path(s) is described as the solution 
set of min P(x, r) as r varies and must, by the first-order necessary 
conditions, be a solution of vp = o. Then one formally identifies the 
gradient of the penalty function, vP, with that of the Lagrangian, v£, by 
identifying the multipliers. Using the definition of these multipliers as 
additional equations, one obtains an Expanded Lagrangian System which 
essentially represents a perturbation of the first-order necessary 
conditions of Karush-Kuhn-Tucker . Furthermore, the inequality constraints 
and the sign of their corresponding multipliers are satisfied, and one has 
an opportunity to obtain a local optimum. Thus, three of the problems 
associated with solving the first-order necessary conditions are resolved. 

Returning to the question of ill-conditioning, we shall show that 
there are only three smooth penalty functions that yield well-conditioned 
Expanded Lagrangian Systems and each is a method of order one (see 
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Section 4). The canonical examples of these three classes are the 
quadratic penalty function for equality constraints, logarithmic barrier 
function (an interior method) and the quadratic loss function (an exterior 
method) for inequality constraints. The remaining smooth penalty 
functions introduce artificial singularities and ill-conditioning into the 
ELS. In addition, one frequently leaves equality affine constraints out 
of the penalized objective function as is illustrated in Section 5. There 
is one additional difficulty that necessitates modification of the 
standard penalty functions. As r tends to zero, a multiplier may tend to 
infinity for two basic reasons: it may not be possible to satisfy a 
constraint or a constraint qualification may fail. This difficulty is 
resolved by using a Fritz John type ELS. 

The salient features of this class of algorithms can now be described 
easily. One first uses an unconstrained optimization technique to get on 
the penalty path in a region where the problem is reasonably well 
conditioned, say r =s 10 1 . Then we switch to the ELS and use 
pseudo-arclength predictor-solver (corrector) continuation techniques [16, 
17, 25] to follow the penalty path with one of two possible objectives. 
We may wish to get to optimality at r = 0 as quickly as possible, or 
continue in r past zero to obtain multiple optima. The global 
optimization technique arises in connection with the latter objective. 

In this work we concentrate on the development and trajectory 
analysis of the Expanded Lagrangian System since it is the key link 
between the use of smooth penalty functions and continuation methods. The 
quadratic penalty-logarithmic barrier function and the quadratic 
penalty-quadratic loss function are examined in Sections 2 and 3, 



5 


respectively. In Section 4 we show that smooth penalty functions other 
than the aforementioned three introduce singularities into the penalty 
path at r = 0 and thus ill-conditioning near r = 0. To further illustrate 
the ELS, we present examples from linear programming and the calculus of 
variations in Section 5, and conclude with a brief discussion of the 
problems and issues not treated here in Section 6. 

2. The Mixed Quadratic Penalty-Logarithmic Barrier Function 

In this section we derive and analyze the Expanded Lagrangian System 
(ELS) for the general nonlinear programming problem when the quadratic 
penalty function is used for the equality constraints and the logarithmic 
barrier function, for the inequality constraints. This ELS represents a 
perturbation of the Karush-Kuhn-Tucker first-order necessary conditions 
and we modify it to obtain a perturbation of the Fritz John conditions. 
This modification is necessitated by the fact that a multiplier may become 
unbounded as r tends to zero. In Theorem 2.2 we analyze the trajectory 
through r = 0 by giving necessary and sufficient conditions for 
nonsingularity of the ELS at r = 0. These conditions are the same as 
those of the nonlinear programming problem, and thus the ELS is just as 
well conditioned, at least for small r. 

The mixed quadratic penalty-logarithmic barrier function is 

P 

(2.1) P(x, r) = f(x) + ^h T (x)h(x) - r ^ ln^x)). 

i=l 

Assuming {x: ,g(x) > 0} to be nonempty (in general it should be robust in 
the sense of Luenberger [20]), the first-order necessary conditions for a 
minimum of P is that v^P = 0, which is a parametrized system of nonlinear 
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equations. However, the numerical problem is that the Jacobian of vp, the 

Hessian of P, becomes increasingly ill conditioned as r •* 0 + . In fact, 

2 1 + 

the condition number K„(v p) = 0(— ) as r -* 0 . A frequently used idea 
b b r 

in bifurcation theory is that a singularity can sometimes be removed by 
expanding the system, and we use this idea here to remove the 
ill-conditioning. Because of its importance we state this as a theorem 
but forego its straightforward proof. 

THEOREM 2.1 . Ifijfc f : IR n -» R 1 , h: IR n -> IR q and g : -♦ IR P be C 1 functions. 

Then when r * 0 and gjfx) * 0 (i = 1 p), (x, r) solves 

(2.2) 0 = VP := vf + (Dh) T [|] - ^vg^j 

i=l 1 

if and only if (x. A, u, r) solves the Expanded Lagrangian System 

0 = vf - (Dh) T A - (D g) T (J 

(2.3) 0 = h + rA 

0 = rg - re, r = diag^j fj p ) 

T p 

where e = (1, 1, 1) € 1R r and D is the transpose of the gradient 

operator v . 

The essence of this theorem is that we have identified the gradient 

of the penalized objective function P, vp, with the gradient of the 

Lagrangian £, v£, by introducing the (perturbed) multipliers A^ = hVr and 

fj = r/gj anc * ^ en adding these as equations. The use of this ELS (2.3) 

in an algorithm starts with the use of an unconstrained optimization 

technique to solve min P(x, r) for some value of r, say r rt . Let x° denote 

o ** 

the solution, and define A*? = h.fx 0 )/^ and u ® = r rt /g.(x°). 

i l ~ 0 j 0 j ~ 


Then 
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(x°, A 0 , jU°, r°) solves (2.3) and one can continue the solution from 

r = r Q to r = 0 using numerical continuation techniques. 

Since equation (2.3) is just a perturbation of the Karush-Kuhn-Tucker 

conditions, the system is just as well conditioned as the nonlinear 

programming problem itself, at least for small r. This expanded system is 

not new, for it has been in the literature for sometime and was used by 

Fiacco and McCormick [4] to investigate the behavior of the penalty path 

near r = 0. Numerically, the system (2.3) has a deficiency in that a 

multiplier may tend to infinity when either a constraint cannot be 

satisfied or a constraint qualification fails with the former probably 

occurring more often. The modification that we now introduce avoids this 

problem, allows continuation past r = 0, and provides a homotopy into the 

Fritz John first-order necessary conditions. 

To achieve the modification, we multiply the first equation in (2.3) 

through by a, rewrite rA in the second equation as aR d multiply the 

2 frl 

third equation by a and rewrite otre as <l j^e . Then redefine the 
r 

variables r := — , A := a A , and [J := ag. The parameter a allows one to 
normalize the multipliers to 1, but rather than using the notation a we 
replace it with /i With these modifications the Expanded Lagrangian 

System can be written as 


(2.4) 


where t = u f 
*p+l 


0 = v L 
x 

0 = h + rA 

0 - r s - ^p +1 r JS 

0 = llgll 2 + IIAIIg - 1 

q p 

Wj ' I'V'i “ d e ' iH i’ 

j=l 1=1 



Note that 
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if we drop the normalization and set fi ^ = 1, we are back to the system 
(2.3). Given this formulation we can now state necessary and sufficient 
conditions for the system (2.4) be regular at r = 0. 





and a. .of (z^, 0) and a function <> € C^(a ) such that F(»(rK r) = 0 for 
all r 6 and £(0) = z°. This solution is locally unique in the sense 

that if (z, r) e a g and F(z, r) = 0, then z belongs to the manifold 
defined bv * . 1 . e . . z = *(r). Furthermore, if f. jg and h a££ C k (C°° ar 
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Since this theorem has been established previously [24] in a slightly 
different context, the proof will not be repeated here; however, several 
remarks are in order. If x^ is a Fritz John or Karush-Kuhn-Tucker point, 
condition (a) is called strict complementarity (gj(x^) = 0 implies 
nonzero) while condition (b) is the Mangasarian-Fromowitz constraint 
qualification [21]. Furthermore, if conditions (a) and (b) are satisfied 
and condition (c) is strengthened to the Hessian of the Lagrangian being 
positive definite on the tangent space, then we have a second-order 
sufficient condition for x° to be a local minimum provided (J* > 0 and 

S ± 5 - 

If conditions (a), (b) and (c) are satisfied at (z , 0), then 

* 0 and we might as well use system (2.3) computationally since it is 

simpler. To decide which of the two systems one should use we should 

+ 

start with (2.3) and if the Lagrange multipliers become large as r -» 0 , 
then we would switch to system (2.4). 

This brings us to the case in which the Jacobian D F(z°, 0) is 

singular at r = 0. This situation can be analyzed effectively using 


bifurcation 

and 

singularity theory as 

in, 

for 

example, the book 

of 

Golubitsky 

and 

Schaefer [11]. Note 

that 

the 

singularities can 

be 


classified into seven classes depending on which of the three conditions 
(a), (b) and (c) in Theorem 2.2 are violated. One of these cases, namely 
the situation in which strict complementarity is violated but the other 
two conditions are valid, has been examined by K. Jittorntrum and M. R. 
Osborne [14], under the assumption that the Hessian of the Lagrangian is 
positive definite on the tangent space T. This problem under weaker 
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assumptions and the remaining cases will be treated in forthcoming work. 

Such an analysis is important for the determination of the expected 

behavior and for sensitivity analysis. 

Before closing this section, we should address a major weakness of 

interior penalty methods: the requirement that the interior of 

{x: g(x) > 0} be robust and the requirement of the existence of an 

interior feasible point x° such that g(x°) > 0. Clearly this is not 

always possible. Although we investigate an exterior method in the next 

section as one possible solution, another is to perturb the inequality 

constraints by the parameter r and still use the logarithmic barrier 

function. As an example, suppose g^(x°) < 0 instead of g^x 0 ) > 0. Then 

r 0 

we could modify the constraint g.(x) > 0 to g.(x) - — (g.(x ) - 1) > 0 

1 ~ 1 ~ r 0 1 ~ 

where r^ is an initial value of the penalty parameter r. Then we modify 

the corresponding penalty function from -r ln(g.(x)) to -r ln(g,(x) - 

1 ' x ' 1 ~ 

r 

— (g.(x U )-l)) since g. (x U ) (g. (x )-l) = 1 > 0. An unconstrained 

r 0 1 ~ 1 ~ r o 

optimization technique can still be used to get on the perturbed penalty 
path. 


3. The Quadratic Penalty-Quadratic Loss Function 

In this section we will use the quadratic loss function, an exterior 
method, to handle the inequality constraints. Since the virtues of 
exterior methods have been discussed previously [4], we forego an 
extensive comparison of interior and exterior methods. The essence of 
their advantage is that an initial strictly feasible point is not required 
and that when many inactive inequality constraints are present, the 
resulting problem size is much smaller than that for an interior method. 
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The price to be paid for these advantages is a lack of higher-order 
differentiability; however, when the three conditions of Theorem 2.2 hold, 
the lack of second-order differentiability is only apparent as will be 
shown in Theorem 3.2. 

The quadratic penalty-quadratic loss function used in this section is 

(3.1) P(x, r) = f(x) + ^h^xjhfx) + 2 p(s"(x)) T (jg"(x)) 

- - T 

where g = (g. , .... g ) and g.(x) = min(g.(x), 0). A difficulty with 

A# 1 P 1 ~ 1 ^ 

this penalty function is that the Hessian becomes discontinuous across an 

inequality constraint boundary g.^ = 0. Although this discontinuity is a 

2 

simple finite jump discontinuity, the jump in v P tends to infinity as r 

tends to zero, and the additional and inevitable ill-conditioning is still 
2 

present in v P. In the Expanded Lagrangian System the jump discontinuity 
tends to zero as r tends to zero and the ill-conditioning is removed 1 In 
the presence of a second-order sufficient condition Fiacco and McCormick 
[4] have shown that there is no jump for sufficiently small r. We extend 
this result in Theorem 3.2, but first we derive the Expanded Lagrangian 
System. 


THEOREM .3,1- Let the hypotheses of Theorem 2.1 be satisfie d and let P be 
the penalized objective function defined bv equation (3.1). Then x is a 


minimizer of P provided 


,h. 


w T fi) 


(3.2) 0 = VP := Vf + (Dh) 

where % is defined following equat ion (3.1). Also, (x. r) solves (3.2) 
for r * 0 if and only if (x. A, g, r) solves the Expanded Lagrangian 
System 
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0 = vL 


0 = h + rA 


(3.3) 


where £ = 


0 = g i + ^l r ’ 1 € 

0 = fJ ^ , 1 <£ V(x) 

and V(x) = {i: 1 < i < p, g^x) < 0}, 


We again omit the straightforward proof. The only difficulty is in 
the manipulation of g and its derivative, but this is explained in 
Luenberger [20] . 

Since multipliers may tend to infinity we use instead the equivalent 
normalized problem 


h + Ar 

IV /v» 

(3.4) S(x. *. U. r> i * V '* « v< *> > l _ 0 

Hi (i « V(x)) 

q p+1 

L j=l i=l J 

q P 

where the Lagrangian L = ^ h i A j “ ^ g i^i* This system is 

j=l 1=1 

derived in a fashion similar to the derivation of F = 0 in equation (2.4), 
and the corresponding trajectory analysis is given as follows. 


THEOREM 3.2. 


Let the hypotheses of Theorem 2.2 be satisfied. Then 
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nonsingular. Thus the three conditions (a), (b) and (c) In Theorem (2.2) 
guarantee the nonslneularltv of D^Gfz^. 0). 

i£i (z°, 0) be a solution of £ = 0. suppose D^G(z°. 0) _is 

nonsingular, and modify (3.4) bv replacing V(x) by V(x°). The resulting 
modified equations, denoted bv G°(z. r) = 0. have a solution ( z . r) = 
(J(r), r) with the following properties: There exist neighborhoods. of 
r = 0 and_® 2 of_(z°, 0) such that jl e C 1 ^), G°($(r), r) = 0 for all r € 

and i(0) = z Q . This solution is locally unique in the sense that (z. 

r) € and G°(z. r) = 0 implies z - J( r). Furthermore, if f. g and h are 

k oo k~l oo 

in C (C or real analytic) then ffi is C (C or real analytic) . 

The proof follows directly from the implicit function theorem [2] and 

our previous work [24] and is thus omitted. This theorem gives a smooth 

0 0 

path through the given solution (z , 0) only by fixing V(x ) and thus 

modifying G. In practice we start with a positive r so that equation 

(3.4) implies /u . > 0. If in addition D G(z°, 0) is nonsingular, then 

X 2r~ ^ 

fj® > 0 for i € V(x°) and = 0 otherwise. Thus the eventual active 

inequality constraints approach from the exterior of the feasible region, 

which implies that the solution (z, r) = (j|(r), r) has the stated 

smoothness only for r € (0, <») ft S6. . Although the change in V(x) as r 

passes through zero produces a discontinuity in the path direction, the 

0 

real problem is that removing one index from V(x ) may cause another index 
that ordinarily would have been removed to stay in V. Thus, it is unclear 
how the equations become modified as r crosses zero. If this problem 
could be easily resolved, continuation past r = 0 could be undertaken. On 
the other hand, one can still continue in the positive r direction in 
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hopes of hitting a turning point and return to r = 0. In this sense this 
method still maintains a "global-like" capability. 

4 . Other Penalty Functions 

In this section we show that the three penalty methods described in 
the previous section are essentially the only smooth penalty methods that 
do not introduce artificial singularities and thus ill-conditioning into 
the penalty path for nonsingular nonlinear programming problems. Our 
classification of interior and exterior methods for inequality constraints 
and of equality penalty methods is based on the work of Lootsraa [19]. 

We shall use the customary term barrier function instead of interior 
penalty function. These functions generally have the following properties 
[23]: 

(i) 4 > : -> IR, lim i> = -w, 

u-+0* 

(4.1) (jj) < 0 and 4>" > 0 on I R + , 

where IR + = (0, «>) . For our purposes we need a refinement of these 

properties and use a classification given by F. A. Lootsma [19]. 

Definition 4.1 . A barrier function 4 > : R + -► iR satisfying properties (i) 
and (ii) in equation (4.1) is said to be a barrier function of order a if 
<>’ is real analytic on R + and has a pole of order a at the origin. 

The three most popular barrier functions are 4> = -In u (a = 1) due to 
Frisch (1955), i> = u 1 (a = 2) due to Carroll (1961), and $ = u 2 (a = 3) 
due to Kowalik (1966), Box (1969) and Fletcher and McCann (1969) [19]. 
The principal result is contained in 
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THEOREM 4.1. Let # be a barrier function of order a and let f, g and h be 
^ “ — " ' ~ 1 

2 

C functions . Then « > 1 and If a > 1 . the Expanded Lagrangian System is 
singular at the penalty parameter value r = 0. 


Proof : We first note that a < 1 implies that lim #(u) = -h» is violated. 

u-»0 + 

Thus we restrict ourselves to the case a > 1. We consider, without loss 

of generality, the nonlinear programming problem min(f(x): g(x) > 0} , 

which has the penalized objective function 

P 

P(x, r) = f ( x ) + r X ^*(g 1 (x)) 

i=l 

where the power of r is included so that the minimizer x(r) will be smooth 

in r (Lootsma [19]). A minimizer x(r) must satisfy 

P 

(4.2) 0 = VP := Vf + r a ^#'(gj)vg i 

i=l 

for which the expanded system is 


(4.3) 


0 = V l 
X 

- „„ a 

0 = TG + r e 


where r = diag(/^ /* p ). G = [^ L -y ^T^g-y ] and i = f 


T 

S u- 


Now since # has a pole of order a at the origin, 


»’’(u) 

[*'(u)r 


_ , a-l . 

= 0(u ) as 


*"(u) + 

u -► 0. Thus a > 1 implies — «— -+ 0 as u -► 0 . To complete the proof 

[*'(u>r 

for a > 1, it suffices to consider the case g^(x) -» 0 as r -» 0 either 

th 

smoothly or as a sequence. Now the (n + i) row of the Jacobian of the 

1 


expanded system (4.3) has as the only potential nonzero entries —y 




and 
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-V’’ (g l ) + th 

o°g.» both of which tend to zero as g. -» 0 . Thus, the (n + i) 

[g'tgj)] 1 1 

row of the Jacobian approaches zero at r -4 0 + . (Notice that we could have 


arrived at this same conclusion by extending the definition of 


u = 0 by continuity.) 




t *'] 2 


to 


Q.E.D. 


In case a 


1 , 


the quotient 


»"(u) 

[*'(u)] 2 


tends to a finite nonzero limit 


as u -► 0 + and one can establish the nonsingularity of the Expanded 
Lagrangian System just as in Theorem 2.2. Of course, the canonical 
example from the class of barrier functions of order one is the 
logarithmic barrier function which was examined in Section 2. 


The next class of penalty methods to be considered is the class of 
exterior penalty functions which we call loss functions [4]. We generally 
require that a loss function satisfy the properties 

(iii) ?(u) = 0 for u > 0, ¥(u) > 0 for u < 0, 

(4.4) (iv) TP'(u) < 0 and *>"(u) > 0 for u < 0, 

(v) ip is continuous across u = 0. 

Following Lootsma [19], we further restrict this class of exterior 
methods : 


Definition 4.3 . A loss function IP satisfying (4.4) is said to be of order 
7 > 0 provided iP'(u) is real analytic for u < 0 with a zero of order 7 at 
u = 0, i.e., (u) = o((-u) 7 ) as u-> 0 . 
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Given this definition we can state the principal results for exterior 
methods . 

2 

THEOREM 4.2 . Let f, h be C functions and suppose ?>(u) is a loss 

function .Qf ardsc,:* . I£,,.ei£her-Q < 7 < l .or 7 > i,„ the.. corresponding 

Expanded Lagrangian System is singular at r = 0. 

Proof : It suffices again to consider the problem min{f(x): g(x) > 0} with 

the corresponding penalized objective function 

P 

P(x, r) = f(x) + r 7 ^^(g^x)) 

1=1 

where r is a loss function of order 7 and the power 7 of r is included to 
ensure a smooth dependence of x on r [19]. Then a minimizer of P 
satisfies 

0 = VP := vf + r" 7 (Dg)V'( gl ) *'(g p )) T 

which has the Expanded Lagrangian System 

0 = v t 
x 

(4 ' 5) 0 = OP'tej) *'(g tf )) T - 

T 

where the Lagrangian l = f - g g. If 7 < 1 , then ¥'(u) becomes unbounded 
as u -» 0 . Thus (4.5) is singular in that 7>'(u) becomes unbounded as 
u -► 0 . If 7 > l, then g^ -► 0 as r -► 0 + implies that the (n + i) tla row of 
the Jacobian tends to zero so that the system (4.5) becomes singular. 

Q.E.D. 

For the case 7 =1, one can again prove a result similar to that 
obtained in Section 3 for the canonical quadratic loss function. Finally 
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we come to the penalty functions for equality constraints. In analogy 
with our two previous definitions we define penalty functions for equality 
constraints as follows. 

Definition 4.4 . Let 9: (R -► IR + U {0}. We say that 9 is a penalty function 

of order fi t provided 9*' > 0, 9* <0 for u < 0 and 9 f > 0 for u > 0, 9 is 

analytic on IR - {0} and 9 f has a zero of order fl at u = 0, i.e. 

9 = 0( |u|^) as u 0. 

For this class of penalty methods, the corresponding result is 
contained in 

THEOREM 4.3 . Let 9 be a penalty function for equality constraints of 
order O . Then the use of e to incorporate equality constraints into the 
penalized objective function yields a corresponding singular Expanded 

Lagrangian System if & < 1 jor fi > 1 . 

We omit the proof of this theorem since it closely parallels the 

previous proof. If p = 1, one can prove a theorem corresponding to those 

in the previous two sections. The canonical order-one method is the 

2 

quadratic penalty function 0(u) = u . 

In conclusion, any of the order-one methods lead to well-conditioned 
Expanded Lagrangian Systems, and the canonical examples of these order-one 
penalty functions are the quadratic penalty function for equality 
constraints, the logarithmic barrier function, an interior method, and the 
quadratic-loss function, an exterior method, for inequality constraints. 
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5. Application to Other Optimization Examples 

The three penalty functions described in the previous sections form 
the basis for the derivation of the Expanded Lagrangian System (ELS). 
There is, however, one important variant. When equality affine 
constraints are present, one frequently leaves them out of the penalized 
objective function. In this case the first-order necessary conditions are 
used to derive the ELS. Although these techniques apply equally well to 
constrained optimization problems in the calculus of variations including 
variational formulations of partial differential equations, optimal 
control, and parameter identification, we present only two additional 
examples — one from linear programming and one from the calculus of 
variations. The former is chosen because of the current interest in the 
Karmarkar algorithm [15] and the latter, because it is an infinite- 
dimensional problem. 

In linear programming many researchers [8] have observed the apparent 
unsatisfactory feature that the simplex method traverses the boundary of 
the feasible region. From the outset attempts were made to cross the 
interior of the feasible region in hopes of producing a faster algorithm, 
but none of these attempts have succeeded, except possibly for the 
recently (1984) published Karmarkar algorithm [15]. This algorithm is 
claimed to be faster than the simplex method, and although these claims 
have been controversial at times, the idea of following a smooth path to 
optimality has always been an appealing one. A similar class of 
algorithms can be based on the use of the three penalty methods of the 
last three sections. As an example we next derive the Expanded Lagrangian 
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Systems for an LP problem In standard form using the logarithmic barrier 
function. 

A standard form of the LP problem with corresponding first-order 
necessary conditions is 


(5.1) 


LP Problem 
Min c^x 


Karush-Kuhn-Tucker Conditions 
T 

* * % 


s.t. Ax - b = 0 
x > 0 


C - A A - g = 0 
Ax - b = 0, x > 0, ju > 0 

rx = o, r = diag^j ^ n ). 

If the logarithmic barrier function is used for the inequality constraints 
and the affine equality constraints are left out of the penalized 
objective function, then we have the following problem formulation and 
corresponding first-order necessary conditions: 

Log Barrier Formulation Karush-Kuhn-Tucker Conditions 

(5.2) Min c T x - r^ln(x.) c - A T A - [JL = 0 


s.t. Ax - b = 0 


Ax - b = 0. 


From the KKT conditions we identify fj. = — (i - 1, . .., n) and expand the 

i x i 

system to obtain the 

Expanded Lagrangian System 
c = A T A - u = 0 
(5.3) Ax - b = 0 


rx - re = 0, r = diag(/i u ) 

where e is a vector in IR n containing a one (1) in each entry. We observe 
again that the ELS is a simple perturbation of the KKT conditions for the 
original problem. The normalization used for equation (2.4) is not needed 
here since we start the algorithm with an x° satisfying Ax° = b and x° > 0 
and a constraint qualification is not required for an LP problem. 
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The problem formulation 5.2 is the same as that used by Gill, Murray, 
Saunders, Tomlin and Wright [8] to show a formal equivalence between their 
projected Newton barrier method and the Karmarkar algorithm [15]. One of 
the deficiencies of this particular formulation is the requirement that 
one must first obtain an / > 0 satisfying Ax° - b = 0. In some cases 
this requires half of the total computational effort [8] , but there are 
techniques that can be used to alleviate this problem. One could consider 
using a quadratic penalty function for the equality affine constraints, 
the quadratic loss function for the inequality constraints, or a 
perturbation technique similar to that discussed at the end of Section 2. 
In any event, the same least squares technique used in the Karmarkar 
algorithm [15] and the projected Newton barrier method [8] can be used to 
solve the linear algebra problem that arises in the solve phase of the 
predictor-solver numerical continuation techniques [16, 25]. Furthermore, 
the use of higher-order predictors adds significantly to the speed of the 
method . 

Finally, we consider an isoperimetric problem from the calculus of 
variations [7 , p. 42]. The problem is 


(5 


fb pb 

.4) Min{J[y] := F(x, y, y' )dx|K[y] := G(x, y, y' )dx = «} 


where the class of admissible functions is (y e C [a, b] : y(a) = A and 
y(b) = B> . Assuming that F and G are smooth, the first-order necessary 
condition for the existence of a solution y is that if y is not an 
extremal of K[y] (a constraint qualification), then there exists a 


constant A such that y is an extremal of (F + AG)dx satisfying K[y] = € , 

■'a 


i.e 


• » 
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(5.5) 


f p y-^y' ♦‘‘Vd^V ’ 0 

K[y] - e = 0 


Ly(a) = A, y(b) = B. 

To derive the Expanded Lagrangian System for (5.4) we incorporate the 
equality constraint into a quadratic penalty function and define 

(5.6) L[y] = J(y] + ^l(K[y] - ( ) 2 

o 

and minimize this functional over (y € C [a, b] : y(a) = A, y(b) = B) . The 
first variation of L along with the definition A = ( K [ y ] - €)/r leads to 
the Expanded Lagrangian System 


(5.7) 


fF - _1 f + a(G - J1g , ) = 0 

y dx y' y dx y" 

: K[y] - t - rA = 0 


(.y(a) = A, y(b) = B. 

Again we observe that these equations are a perturbation of the 
first-order necessary conditions given in (5.5). 

The numerical procedure to be used starts with a discretization of 
(5.6) possibly using a coarse grid. Then an unconstrained code is used to 
minimize this discrete problem. Using the discrete version of 
A = (K[y]-£ )/r, we switch to (5.7) and use numerical continuation techniques 
to trace the solution to r=0. Of course, one of the advantages is that we 
could refine the discretization as r -♦ 0 + , say by interpolation. 

Furthermore, we could continue past r=0 and investigate the possibility of 
multiple minima by hitting limit points and returning to r = 0 on 
different branches of the solution curves. In practice, we again stress 
that A could tend to infinity because either y is an extremal of K[y] or 
the constraint K[y] = € cannot be satisfied. In this case the Fritz John 
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type formulation similar to that developed In Sections 2 and 3 could be 
used. 


6. Discussion and Conclusions 

One of the objectives in this work has been to re-examine the use of 
smooth penalty functions as an effective computational technique in light 
of recent developments in numerical continuation techniques. The Expanded 
Lagrangian System, which has been a focal point of this work, is the key 
link between these methods as well as the key to the removal of the 
ill-conditioning traditionally associated with smooth penalty methods. 
This parametrized system of nonlinear equations with the penalty parameter 
being the parameter is well conditioned only for three smooth penalty 
methods, each of order one as defined in Section 4. The canonical 
examples are the quadratic penalty function for equality constraints, 
logarithmic barrier function, an interior method, and the quadratic loss 
function, an exterior method, for inequality constraints. 

Although the algorithm that makes use of the ELS has many variants, 
the essential features can easily be described. In the first phase one 
uses a linearly constrained or unconstrained optimization technique to get 
on the penalty path for a value of the penalty parameter where the Hessian 
of the penalty function is well conditioned. Rather than following the 
penalty path using sequential unconstrained optimization techniques, which 
leads to ill-conditioning, we switch to the Expanded Lagrangian System and 
use numerical continuation techniques. There are at least two strategies 
for continuing in r. The first is to continue to r = 0, and hopefully 
optimality, as quickly as possible. A second strategy is to investigate 
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the solution set for r varying over some range [A, B] where A < 0 < B in 
hopes of reaching multiple optima. This latter strategy yields a global 
optimization technique and is a significant additional advantage of the 
method. 

The speed of this algorithm is highly dependent upon the 
unconstrained optimization techniques, the efficiency of the predictors 
and the linear algebra in the corrector or solve phase. To get on the 
penalty path initially, a quasi-Newton method using a BFGS update and an 
inexact line search is usually recommended for medium-size problems [3]; 
however, the preconditioned conjugate gradients are also popular for 
large-scale problems arising in optimal control, calculus of variations 
and parameter identification [10, 12]. Once on the penalty path the 
efficiency of the predictor-solve/corrector continuation technique to 
traverse the curve is of primary importance. 

A currently popular and acceptable predictor technique is Euler's 
method with error and stepsize control [25]. The philosophy is to use the 
predictor to get within the domain of contraction of Newton's method and 
then iterate to convergence. This domain is reached by specifying an 
error in the predicted value and then choosing a stepsize to achieve this 
error. Generally, one can use a higher-order predictor such as an 
Adams-Bashforth predictor [26, 27] and obtain the same accuracy with a 
much larger stepsize. This efficient stepping along the curve can add 
significantly to the speed of the algorithm and will be investigated in a 
forthcoming work. 

The linear algebra involved in the correction back to the curve is 
essentially that of Newton's method or one of its many variants [3, 10]. 
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The use of highly accurate predictors and relatively large stepsizes 
considerably reduces the number of matrix decompositions. To be sure, 
this aspect of the problem needs extensive investigation since the most 
efficient methods are yet to be determined. However, for the linear 

programming problem of Section 5, the same least squares used by Karmarkar 
[15] and Gill, Murray, Saunders, Tomlin and Wright [8] can be used in the 
solve phase. 

Once on the penalty path one continues to track a minimum unless an 
eigenvalue changes its sign. Such a change generically corresponds to a 
bifurcation of the penalty path, and the minimum may be lost or it may 
persist locally. Although bifurcation techniques for branch switching 
[16, 25] can also be used, the frequency of this bifurcation in the 
continuation of the penalty path from r = 0.1 or 0.01 to 0.0 is unclear. 
However, it most assuredly occurs when the continuation is over a large 
interval [A, B] . 

Since in the continuation of the penalty path from r = 10 1 to r = 0 
the parameter r is small, a practical approach to the bifurcation problem 
as well as the intimately related sensitivity problem is to investigate 
the bifurcation behavior of the Expanded Lagrangian System for the 
nonlinear parametric programming problem 

min<f(x, a) : h(x, a) = 0, £(x, a) > 0} 
where a € iR m is a vector of parameters in the case that the Jacobian 
D z F(z°, <*°, 0) is singular. Perturbed bifurcation theory [11] is a 

possible tool for answering this question. The behavior is important from 
a computational viewpoint since one would like to know the expected 


behavior and difficulties. 
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